deeponet model
Active operator learning with predictive uncertainty quantification for partial differential equations
Winovich, Nick, Daneker, Mitchell, Lu, Lu, Lin, Guang
In this work, we develop a method for uncertainty quantification in deep operator networks (DeepONets) using predictive uncertainty estimates calibrated to model errors observed during training. The uncertainty framework operates using a single network, in contrast to existing ensemble approaches, and introduces minimal overhead during training and inference. We also introduce an optimized implementation for DeepONet inference (reducing evaluation times by a factor of five) to provide models well-suited for real-time applications. We evaluate the uncertainty-equipped models on a series of partial differential equation (PDE) problems, and show that the model predictions are unbiased, non-skewed, and accurately reproduce solutions to the PDEs. To assess how well the models generalize, we evaluate the network predictions and uncertainty estimates on in-distribution and out-of-distribution test datasets. We find the predictive uncertainties accurately reflect the observed model errors over a range of problems with varying complexity; simpler out-of-distribution examples are assigned low uncertainty estimates, consistent with the observed errors, while more complex out-of-distribution examples are properly assigned higher uncertainties. We also provide a statistical analysis of the predictive uncertainties and verify that these estimates are well-aligned with the observed error distributions at the tail-end of training. Finally, we demonstrate how predictive uncertainties can be used within an active learning framework to yield improvements in accuracy and data-efficiency for outer-loop optimization procedures.
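The active learning step described above can be illustrated with a minimal sketch. It assumes the network returns one scalar uncertainty per candidate input; the helper names `select_most_uncertain` and `coverage` are hypothetical and are not taken from the paper, and the coverage check is only one simple way to compare predicted uncertainties against observed errors.

```python
import numpy as np

def select_most_uncertain(sigma, k):
    """Return indices of the k candidates with the largest predicted uncertainty."""
    sigma = np.asarray(sigma, dtype=float)
    # argsort is ascending, so take the last k entries and reverse them
    return np.argsort(sigma)[-k:][::-1]

def coverage(errors, sigma, n_std=2.0):
    """Fraction of absolute errors that fall within n_std predicted deviations."""
    errors = np.abs(np.asarray(errors, dtype=float))
    sigma = np.asarray(sigma, dtype=float)
    return float(np.mean(errors <= n_std * sigma))

# Hypothetical per-candidate uncertainties from a single trained network
sigma = np.array([0.10, 0.45, 0.05, 0.30])
picked = select_most_uncertain(sigma, k=2)  # candidates to label and add to the training set
```

In an outer loop, the selected candidates would be solved with the reference PDE solver, appended to the training data, and the model retrained or fine-tuned.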
Virtual Sensing-Enabled Digital Twin Framework for Real-Time Monitoring of Nuclear Systems Leveraging Deep Neural Operators
Hossain, Raisa Bentay, Ahmed, Farid, Kobayashi, Kazuma, Koric, Seid, Abueidda, Diab, Alam, Syed Bahauddin
Effective real-time monitoring is a foundation of digital twin technology, crucial for detecting material degradation and maintaining the structural integrity of nuclear systems to ensure both safety and operational efficiency. Traditional physical sensor systems face limitations such as installation challenges, high costs, and difficulty measuring critical parameters in hard-to-reach or harsh environments, often resulting in incomplete data coverage. Machine learning-driven virtual sensors, integrated within a digital twin framework, offer a transformative solution by enhancing physical sensor capabilities to monitor critical degradation indicators like pressure, velocity, and turbulence. However, conventional machine learning models struggle with real-time monitoring due to the high-dimensional nature of reactor data and the need for frequent retraining. This paper introduces the use of Deep Operator Networks (DeepONet) as a core component of a digital twin framework to predict key thermal-hydraulic parameters in the hot leg of an AP-1000 Pressurized Water Reactor (PWR). DeepONet serves as a dynamic and scalable virtual sensor by accurately mapping the interplay between operational input parameters and spatially distributed system behaviors. In this study, DeepONet is trained on a range of operational conditions, which relaxes the requirement for continuous retraining and makes it suitable as an online, real-time prediction component of a digital twin. Our results show that DeepONet achieves accurate predictions with low mean squared error and relative L2 error and can make predictions on unseen data 1400 times faster than traditional CFD simulations. This speed and accuracy enable DeepONet to synchronize with the physical system in real time, functioning as a dynamic virtual sensor that tracks degradation-contributing conditions.
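The two error measures named above have standard definitions, sketched here with NumPy. The function names are illustrative, and the paper's exact normalization (for example, per-sample versus global) is not stated in the abstract.

```python
import numpy as np

def mse(pred, true):
    """Mean squared error over all entries."""
    pred, true = np.asarray(pred, dtype=float), np.asarray(true, dtype=float)
    return float(np.mean((pred - true) ** 2))

def relative_l2(pred, true):
    """Relative L2 error: ||pred - true||_2 / ||true||_2."""
    pred, true = np.asarray(pred, dtype=float), np.asarray(true, dtype=float)
    return float(np.linalg.norm(pred - true) / np.linalg.norm(true))
```

The relative L2 error is scale-free, which makes it easier to compare fields with different magnitudes, such as pressure and velocity.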
Super Resolution Based on Deep Operator Networks
We use Deep Operator Networks (DeepONets) to perform super-resolution reconstruction of the solutions of two types of partial differential equations and compare the model predictions with the results obtained using conventional interpolation methods to verify the advantages of DeepONets. We employ two pooling methods to downsample the original data and conduct super-resolution reconstruction under three different resolutions of input images. The results show that the DeepONet model can predict high-frequency oscillations and small-scale structures from low-resolution inputs very well. For the two-dimensional problem, we introduce convolutional layers to extract information from input images at a lower cost than pure MLPs. We adjust the size of the training set and observe the variation of prediction errors. In both one-dimensional and two-dimensional cases, the super-resolution reconstruction using the DeepONet model demonstrates much more accurate prediction results than cubic spline interpolation, highlighting the superiority of operator learning methods in handling such problems compared to traditional interpolation techniques.
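The downsampling step can be sketched as follows for a one-dimensional signal. The abstract does not say which two pooling methods were used; average and max pooling are assumed here purely for illustration, and `pool_1d` is a hypothetical helper.

```python
import numpy as np

def pool_1d(signal, factor, mode="mean"):
    """Downsample a 1D signal by pooling non-overlapping windows of length `factor`."""
    signal = np.asarray(signal, dtype=float)
    n = (signal.size // factor) * factor  # drop a ragged tail, if any
    windows = signal[:n].reshape(-1, factor)
    if mode == "mean":
        return windows.mean(axis=1)
    if mode == "max":
        return windows.max(axis=1)
    raise ValueError(f"unknown pooling mode: {mode}")

# A high-resolution sample and two low-resolution versions of it
x = np.linspace(0.0, 1.0, 64)
high_res = np.sin(8 * np.pi * x)
low_res_mean = pool_1d(high_res, 4, "mean")
low_res_max = pool_1d(high_res, 4, "max")
```

The low-resolution arrays would serve as the network input, with the high-resolution signal as the reconstruction target.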
A novel data generation scheme for surrogate modelling with deep operator networks
Choubey, Shivam, Pal, Birupaksha, Agrawal, Manish
However, due to intensive computational requirements, it is not feasible to deploy these techniques directly in numerous cases, such as parametric optimization, real-time prediction for control applications, etc. Machine learning-based surrogate models offer an alternative way to simulate physical systems efficiently. Deep learning, due to its ability to model arbitrary input-output relationships efficiently, is the most widely accepted choice for surrogate modelling. In general, these surrogate models are data-driven models, where simulation or experimental data is used for training. Once the surrogate model is trained, it can be used to predict the system output for unobserved data with minimal computational effort. For surrogate modelling, both vanilla and specialized neural networks such as convolutional neural networks have gained immense popularity in both scientific and industrial applications [1, 2]. Further, operator learning, a new paradigm in deep learning, was recently proposed in [3]. Various operator learning techniques have been proposed in the literature, such as deep operator networks (DeepONets) [4], Laplace Neural Operators (LNO) [5], Fourier Neural Operators (FNO) [6] and the General Neural Operator Transformer for Operator learning (GNOT) [7]. In this paper, we focus on DeepONets as an operator learning technique and show a novel way to reduce the computational cost associated with training the model. DeepONet is based on the lesser known cousin of the 'Universal Approximation
Potential of Deep Operator Networks in Digital Twin-enabling Technology for Nuclear System
Kobayashi, Kazuma, Alam, Syed Bahauddin
This research introduces the Deep Operator Network (DeepONet) as a robust surrogate modeling method within the context of digital twin (DT) systems for nuclear engineering. With the increasing importance of nuclear energy as a carbon-neutral solution, adopting DT technology has become crucial to enhancing operational efficiencies, safety, and predictive capabilities in nuclear engineering applications. DeepONet exhibits remarkable prediction accuracy, outperforming traditional ML methods. Through extensive benchmarking and evaluation, this study showcases the scalability and computational efficiency of DeepONet in solving a challenging particle transport problem. By taking functions as input data and constructing the operator $G$ from training data, DeepONet can handle diverse and complex scenarios effectively. However, the application of DeepONet also reveals challenges related to optimal sensor placement and model evaluation, critical aspects of real-world implementation. Addressing these challenges will further enhance the method's practicality and reliability. Overall, DeepONet presents a promising and transformative tool for nuclear engineering research and applications. Its accurate prediction and computational efficiency capabilities can revolutionize DT systems, advancing nuclear engineering research. This study marks an important step towards harnessing the power of surrogate modeling techniques in critical engineering domains.
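The idea of taking a function as input and approximating an operator $G$ can be made concrete with a minimal, untrained DeepONet forward pass. This is a sketch of the standard branch/trunk formulation, $G(u)(y) \approx \sum_k b_k(u)\, t_k(y) + b_0$, and not the architecture used in this study; the layer sizes, sensor count, and helper names are assumptions.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network with tanh on hidden layers and a linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet_forward(branch, trunk, bias, u_sensors, y):
    """u_sensors: (batch, m) input function values at m fixed sensors.
    y: (n_points, d) query locations. Returns an array of shape (batch, n_points)."""
    b = mlp(branch, u_sensors)  # (batch, p) coefficients from the branch net
    t = mlp(trunk, y)           # (n_points, p) basis values from the trunk net
    return b @ t.T + bias       # inner product over the p latent features

rng = np.random.default_rng(0)
m, d, p = 50, 1, 32  # sensors, query dimension, latent width (all illustrative)
branch = init_mlp([m, 64, p], rng)
trunk = init_mlp([d, 64, p], rng)
u = np.sin(np.linspace(0.0, 1.0, m))[None, :]  # one input function sampled at m sensors
y = np.linspace(0.0, 1.0, 20)[:, None]         # 20 query points
out = deeponet_forward(branch, trunk, 0.0, u, y)
```

Because the branch net only sees the input function at the `m` sensor locations, the choice of those locations directly affects what the model can learn, which is where the sensor placement question raised above comes from.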
Learning bias corrections for climate models using deep neural operators
Bora, Aniruddha, Shukla, Khemraj, Zhang, Shixuan, Harrop, Bryce, Leung, Ruby, Karniadakis, George Em
Numerical simulation for climate modeling that resolves all important scales is a computationally taxing process. To circumvent this issue, a low-resolution simulation is performed and subsequently corrected for bias using reanalysis data (ERA5), a procedure known as nudging correction. The existing implementation of nudging correction uses a relaxation-based method applied to the algebraic difference between the low-resolution and ERA5 data. In this study, we replace the bias correction process with a surrogate model based on the Deep Operator Network (DeepONet). DeepONet learns the mapping from the state before nudging (a functional) to the nudging tendency (another functional). The nudging tendency is very high-dimensional data, albeit with many low-energy modes. Therefore, the DeepONet is combined with a convolution-based auto-encoder-decoder (AED) architecture in order to learn the nudging tendency efficiently in a lower-dimensional latent space. The accuracy of the DeepONet model is tested against the nudging tendency obtained from the E3SMv2 (Energy Exascale Earth System Model) and shows good agreement. The overarching goal of this work is to deploy the DeepONet model in an online setting and replace the nudging module in the E3SM loop for better efficiency and accuracy.
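A relaxation-based nudging term is commonly written as a tendency proportional to the difference between the reference and model states. The sketch below assumes that generic Newtonian-relaxation form with a single time scale `tau`; the actual E3SM implementation may differ, and both function names are hypothetical.

```python
import numpy as np

def nudging_tendency(state, reference, tau):
    """Tendency that relaxes `state` toward `reference` over time scale `tau`."""
    state = np.asarray(state, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return (reference - state) / tau

def step_with_nudging(state, model_tendency, reference, tau, dt):
    """One explicit Euler step of the model tendency plus the nudging tendency."""
    state = np.asarray(state, dtype=float)
    total = np.asarray(model_tendency, dtype=float) + nudging_tendency(state, reference, tau)
    return state + dt * total
```

In the surrogate setting described above, a trained operator network would predict the output of `nudging_tendency` from the pre-nudging state alone, without access to the reference data at run time.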
A Hybrid Deep Neural Operator/Finite Element Method for Ice-Sheet Modeling
He, QiZhi, Perego, Mauro, Howard, Amanda A., Karniadakis, George Em, Stinis, Panos
One of the most challenging and consequential problems in climate modeling is to provide probabilistic projections of sea level rise. A large part of the uncertainty of sea level projections is due to uncertainty in ice sheet dynamics. At the moment, accurate quantification of the uncertainty is hindered by the cost of ice sheet computational models. In this work, we develop a hybrid approach to approximate existing ice sheet computational models at a fraction of their cost. Our approach consists of replacing the finite element model for the momentum equations for the ice velocity, the most expensive part of an ice sheet model, with a Deep Operator Network, while retaining a classic finite element discretization for the evolution of the ice thickness. We show that the resulting hybrid model is very accurate and an order of magnitude faster than the traditional finite element model. Further, a distinctive feature of the proposed model compared to other neural network approaches is that it can handle high-dimensional parameter spaces (parameter fields) such as the basal friction at the bed of the glacier, and can therefore be used for generating samples for uncertainty quantification. We study the impact of hyper-parameters, number of unknowns and correlation length of the parameter distribution on the training and accuracy of the Deep Operator Network on a synthetic ice sheet model. We then target the evolution of the Humboldt glacier in Greenland and show that our hybrid model can provide accurate statistics of the glacier mass loss and can be effectively used to accelerate the quantification of uncertainty.
Improved architectures and training algorithms for deep operator networks
Wang, Sifan, Wang, Hanwen, Perdikaris, Paris
Operator learning techniques have recently emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces. Trained under appropriate constraints, they can also be effective in learning the solution operator of partial differential equations (PDEs) in an entirely self-supervised manner. In this work we analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory, and reveal a bias that favors the approximation of functions with larger magnitudes. To correct this bias we propose to adaptively re-weight the importance of each training example, and demonstrate how this procedure can effectively balance the magnitude of back-propagated gradients during training via gradient descent. We also propose a novel network architecture that is more resilient to vanishing gradient pathologies. Taken together, our developments provide new insights into the training of DeepONets and consistently improve their predictive accuracy by a factor of 10-50x, demonstrated in the challenging setting of learning PDE solution operators in the absence of paired input-output observations. All code and data accompanying this manuscript are publicly available at \url{https://github.com/PredictiveIntelligenceLab/ImprovedDeepONets}.
Learning the solution operator of parametric partial differential equations with physics-informed DeepOnets
Wang, Sifan, Wang, Hanwen, Perdikaris, Paris
Deep operator networks (DeepONets) are receiving increased attention thanks to their demonstrated capability to approximate nonlinear operators between infinite-dimensional Banach spaces. However, despite their remarkable early promise, they typically require large training data-sets consisting of paired input-output observations which may be expensive to obtain, while their predictions may not be consistent with the underlying physical principles that generated the observed data. In this work, we propose a novel model class coined as physics-informed DeepONets, which introduces an effective regularization mechanism for biasing the outputs of DeepONet models towards ensuring physical consistency. This is accomplished by leveraging automatic differentiation to impose the underlying physical laws via soft penalty constraints during model training. We demonstrate that this simple, yet remarkably effective extension can not only yield a significant improvement in the predictive accuracy of DeepONets, but also greatly reduce the need for large training data-sets. To this end, a remarkable observation is that physics-informed DeepONets are capable of solving parametric partial differential equations (PDEs) without any paired input-output observations, except for a set of given initial or boundary conditions. We illustrate the effectiveness of the proposed framework through a series of comprehensive numerical studies across various types of PDEs. Strikingly, a trained physics-informed DeepONet model can predict the solution of $\mathcal{O}(10^3)$ time-dependent PDEs in a fraction of a second -- up to three orders of magnitude faster compared to a conventional PDE solver. The data and code accompanying this manuscript are publicly available at \url{https://github.com/PredictiveIntelligenceLab/Physics-informed-DeepONets}.
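The soft penalty construction described above can be written schematically as a two-term loss. The notation here is an assumption made for illustration and is not copied from the paper: $G_\theta$ is the DeepONet, $u^{(i)}$ are $N$ sampled input functions, $y_j$ are $P$ points where initial or boundary values $s^{(i)}(y_j)$ are known, $x_j$ are $Q$ collocation points, and $\mathcal{N}$ is the PDE residual operator evaluated with automatic differentiation.

```latex
\mathcal{L}(\theta) = \mathcal{L}_{\text{data}}(\theta) + \mathcal{L}_{\text{physics}}(\theta),
\qquad
\mathcal{L}_{\text{data}}(\theta) = \frac{1}{NP} \sum_{i=1}^{N} \sum_{j=1}^{P}
  \left| G_\theta(u^{(i)})(y_j) - s^{(i)}(y_j) \right|^2,
\qquad
\mathcal{L}_{\text{physics}}(\theta) = \frac{1}{NQ} \sum_{i=1}^{N} \sum_{j=1}^{Q}
  \left| \mathcal{N}\!\left(u^{(i)}, G_\theta(u^{(i)})\right)(x_j) \right|^2
```

When no paired solution data is available, the first term involves only initial and boundary conditions, and the residual term carries the rest of the supervision.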